Abstract: An abundant literature dealing with the population genetics and taxonomy of Giardia duodenalis, Cryptosporidium spp., Pneumocystis spp., and Cryptococcus spp., pathogens of high medical and veterinary relevance, has been produced in recent years. We have analyzed these data in the light of new population genetic concepts dealing with predominant clonal evolution (PCE) recently proposed by us. In spite of the considerable phylogenetic diversity that exists among these pathogens, we have found striking similarities among them. The two main PCE features described by us, namely highly significant linkage disequilibrium and near-clading (stable phylogenetic clustering clouded by occasional recombination), are clearly observed in Cryptococcus and Giardia, and more limited indication of them is also present in Cryptosporidium and Pneumocystis. Moreover, in several cases, these features still obtain when the near-clades that subdivide the species are analyzed separately ("Russian doll pattern"). Lastly, several sets of data undermine the notion that certain microbes form clonal lineages simply owing to a lack of opportunity to outcross due to low transmission rates leading to lack of multiclonal infections ("starving sex hypothesis"). We propose that the divergent taxonomic and population genetic inferences advanced by various authors about these pathogens may not correspond to true evolutionary differences and could be, rather, the reflection of idiosyncratic practices among compartmentalized scientific communities. The PCE model provides an opportunity to revise the taxonomy and applied research dealing with these pathogens and others, such as viruses, bacteria, parasitic protozoa, and fungi.



Introduction: The Model of 
Predominant Clonal Evolution 
(PCE) 

The PCE model [1,2] defines clonal evolution as scarcity or absence of genetic recombination, a definition that is accepted by most authors working on pathogen population genetics [3], including those working on the species surveyed here [4-9]. The PCE model [3,10,11] (i) does not presume that recombination is absent [12,13] or plays a minor evolutionary role, but only that it is too rare to break the prevalent pattern of clonality; (ii) addresses each species as a whole, and not its genetic subdivisions considered individually [14]; and (iii) explicitly includes selfing/inbreeding/homogamy (which lead to restrained recombination) as particular cases of PCE, rather than as distinct evolutionary models [1-3,10,11,15]. This view is shared by many authors working on the pathogens analyzed here [12,16-19] and by others [20]. A few authors [21,22] prefer to limit the concept of clonality to "strict" clonality (i.e., mitotic propagation) and consider that it should be distinguished from selfing/inbreeding/"unisex." This is a matter of definition. It is nevertheless worth noting that, for the examples cited in [21] and unlike the authors of that article, all scientists working on parthenogenesis in insects, amphibians, fishes, and reptiles include parthenogenesis in clonality.



As we have explained at length [1-3], biases that could lead to false inferences of restrained recombination (mainly isolation by distance and/or time, or the Wahlund effect) should be carefully ruled out before concluding that a PCE pattern is present.

Lastly, as we have emphasized [3], the PCE model states that restrained recombination is mainly due to built-in properties of microbes, rather than to the downstream elimination of most possible recombinants by natural selection and epistasis. If natural selection were the main factor maintaining clonality, the cost to the organisms concerned would be unacceptable, because most of the offspring would be eliminated at each generation. Natural selection certainly acts on microbes, as it does on any organism. However, our proposal is that it cannot be the main factor responsible for PCE in organisms that would otherwise be potentially panmictic.

Author Summary

Micropathogen species definition is extremely difficult, since concepts applied to higher organisms (the biological species concept) are inadequate. In particular, the pathogens here surveyed have given rise to long-lasting controversies about their species status and that of the genotypes that subdivide them. The population genetic approach based on the predominant clonal evolution (PCE) concept proposed by us could bring simple solutions to these controversies, since it permits the description of clearly defined evolutionary entities (clonal multilocus genotypes and near-clades [incompletely isolated clades]) that could be the basis for species description, if the concerned specialists find it justified for applied research. The PCE model also provides a convenient framework for applied studies (molecular epidemiology, vaccine and drug design, clinical research) dealing with these pathogens and others.

Recent Developments

We have recently proposed new insights about PCE, applicable to all kinds of micropathogens (including viruses, bacteria, parasites, and fungi) [3] and, more specifically, to Trypanosoma and Leishmania [10] and to Plasmodium and Toxoplasma [11]. We have proposed replacing subjective and imprecise assertions such as "recombination at a high rate" [14] or "gross incongruences" [23] with a clear-cut PCE definition relying on two complementary criteria: (i) statistically significant linkage disequilibrium (LD), or nonrandom association of genotypes occurring at different loci, and (ii) growing phylogenetic signal when more reliable data are added. Lastly, we have discussed the possibility of distinguishing PCE from cryptic biological speciation. We have also distinguished clonality by lack of available mating partners (due to scarcity of multiclonal infections) from built-in clonality.

LD is the statistic that most directly reveals a lack of recombination, the basic definition of PCE. Unlike segregation tests, LD analysis does not require that the organism under survey be diploid, nor does it require knowledge of its ploidy [3]. This is highly relevant for micropathogens [3], since aneuploidy appears to be widespread among them, including in fungi, Trypanosoma, and Leishmania [12], which invalidates tests based on diploidy. When a sufficient set of loci is analyzed, LD is a very powerful statistic [1].
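
As an illustration of why LD testing needs no ploidy information, the following is a minimal sketch of a permutation test for multilocus LD. It is not the procedure used in the cited studies: the statistic (variance of pairwise allelic mismatches), the function names, and the toy data are all assumptions chosen for illustration. Each isolate is represented as a tuple of allele labels, one per locus, and the null hypothesis of random association is simulated by shuffling alleles independently within each locus.

```python
import random
from itertools import combinations
from statistics import pvariance


def pairwise_distance_variance(mlgs):
    """Variance of the number of mismatching loci over all pairs of isolates."""
    dists = [sum(a != b for a, b in zip(x, y)) for x, y in combinations(mlgs, 2)]
    return pvariance(dists)


def ld_permutation_test(mlgs, n_perm=999, seed=0):
    """Return (observed statistic, one-sided p-value) under random association.

    Hypothetical helper: alleles are shuffled within each locus, which keeps
    allele frequencies fixed but destroys any association between loci.
    """
    rng = random.Random(seed)
    observed = pairwise_distance_variance(mlgs)
    columns = [list(col) for col in zip(*mlgs)]  # one list of alleles per locus
    exceed = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        permuted = list(zip(*shuffled))
        if pairwise_distance_variance(permuted) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data (assumed): two multilocus genotypes, each sampled ten times.
clonal_sample = [(0, 0, 0, 0)] * 10 + [(1, 1, 1, 1)] * 10
stat, p_value = ld_permutation_test(clonal_sample)
print(f"variance of pairwise distances = {stat:.3f}, p = {p_value:.3f}")
```

A small p-value indicates that alleles at different loci are associated more strongly than expected by chance. It says nothing, by itself, about whether the association is due to restrained recombination or to population subdivision.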

One has to ascertain that LD cannot be explained by trivial physical obstacles (isolation by space or time: the Wahlund effect) [2]. LD is widely used as circumstantial evidence for PCE by authors working on the pathogens considered here [7,24-26] and by others [27,28]. A telling consequence of LD is the spread of stable multilocus genotypes (MLGs) over vast time and space scales [3]. However, this pattern depends on the rate of evolution (molecular clock) of the marker considered and might not be observed with fast-evolving markers such as microsatellites, even in the case of strong linkage disequilibrium [3].
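
The Wahlund effect mentioned above can be made concrete with a small worked calculation. The sketch below assumes two subpopulations that are each in linkage equilibrium at two biallelic loci and are pooled in proportions m and 1 - m. The function name and the numerical values are hypothetical. Under these assumptions the pooled sample shows D = m(1 - m)(pA1 - pA2)(pB1 - pB2), so LD appears whenever allele frequencies differ at both loci, even though recombination is unrestricted within each subpopulation.

```python
def pooled_ld(m, pa1, pb1, pa2, pb2):
    """LD coefficient D in a pooled sample of two subpopulations.

    m        : proportion of the sample drawn from subpopulation 1
    pa1, pb1 : frequencies of alleles A and B in subpopulation 1
    pa2, pb2 : frequencies of alleles A and B in subpopulation 2
    Each subpopulation is assumed to be in linkage equilibrium, so the
    AB haplotype frequency within it is simply the product pa * pb.
    """
    freq_ab = m * pa1 * pb1 + (1 - m) * pa2 * pb2
    freq_a = m * pa1 + (1 - m) * pa2
    freq_b = m * pb1 + (1 - m) * pb2
    return freq_ab - freq_a * freq_b


# Assumed values: strongly differentiated subpopulations, equal mixing.
d_mixed = pooled_ld(0.5, 0.9, 0.9, 0.1, 0.1)
# Same allele A frequency in both subpopulations: no spurious LD.
d_none = pooled_ld(0.5, 0.5, 0.3, 0.5, 0.7)
print(f"D (differentiated) = {d_mixed:.3f}, D (undifferentiated at A) = {d_none:.3f}")
```

This is why sampling over space and time has to be examined before LD is read as evidence of restrained recombination.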

The criterion of a growing phylogenetic signal when more adequate data are added relies on the congruence principle [29], which states that if the working hypothesis is correct, evidence increases as more data are considered. For example, when a set of Multilocus Sequence Typing (MLST) data is considered, although some discrepancies can be observed between individual gene trees, the phylogenetic signal gets stronger and stronger when more loci are included in the combined tree. Likewise, the genetic distances calculated from different molecular markers are strongly correlated (the "g" test [1]). If the impact of recombination were stronger than clonal propagation in the long run, the contrary would obtain. This congruence-based approach may fail when inadequate data are compared, for example, markers with different molecular clocks or undergoing different selective pressures or different evolutionary tendencies. This could lead to wrong assertions of recombination [10]. The main manifestation of this growing phylogenetic signal is the existence of genetic subdivisions that are stable in space and time ("near-clades" [3]). The term "clade" [26,30,31] is not adequate when micropathogens are concerned, because even when PCE obtains, some residual recombination can always occur [3].
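
The correlation between genetic distances obtained from different markers can be sketched as follows. This is a Mantel-style permutation test used here as a stand-in; it is not claimed to be the exact procedure of the "g" test in [1]. The function names and the toy distance matrices are assumptions. Two square distance matrices over the same isolates are compared, and significance is assessed by permuting the isolate labels of one matrix.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """Correlation between two distance matrices with a permutation p-value."""
    n = len(dist_a)
    pairs = list(combinations(range(n), 2))
    a = [dist_a[i][j] for i, j in pairs]
    observed = pearson(a, [dist_b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # relabel the isolates of the second matrix
        b = [dist_b[order[i]][order[j]] for i, j in pairs]
        if pearson(a, b) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data (assumed): six isolates forming two groups, typed with two
# marker systems whose distances differ only by a scale factor.
positions = [0, 1, 2, 10, 11, 12]
marker_1 = [[abs(p - q) for q in positions] for p in positions]
marker_2 = [[2 * abs(p - q) for q in positions] for p in positions]
r, p_value = mantel_test(marker_1, marker_2)
print(f"r = {r:.3f}, p = {p_value:.3f}")
```

Under the congruence principle, a strong and significant correlation between independent marker sets is what predominant clonal propagation predicts, provided the markers have comparable molecular clocks.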

We have differentiated PCE from cryptic speciation. It has been inferred that apparent clonality could be explained by the fact that the species under study is subdivided into discrete genetic clusters, among which recombination is inhibited while it is not within them [32]. Such a model amounts to equating these genetic subdivisions to cryptic biological species. To distinguish this case from PCE, we have proposed [10] the "Russian doll model." If the PCE criteria are met not only at the level of the whole species but also within its genetic subdivisions, this favors PCE rather than cryptic speciation. In this case, the genetic subdivisions of the species show a miniature picture of the whole species, with LD and lesser near-clades (Figure 1). However, this approach should be applied with care, by selecting markers with an adequate resolution power (molecular clock). As a matter of fact, when addressing lesser genetic subdivisions rather than the whole species, one changes evolutionary scales. If the resolution of the markers is not adapted accordingly, lack of PCE signal could be due to a statistical type II error (lack of resolution). For the same reason, the sample size should not become too small.

We have also discussed apparent clonality by lack of available mating partners in low transmission cycles. To explain apparent manifestations of clonality in Plasmodium falciparum [1,2], it has been proposed that selfing/inbreeding occurred "mechanically" in low transmission areas because mixed infections of different genotypes are rare, which makes outcrossing impossible [33]. We have called this model the "starving sex hypothesis" and have shown that it was frequently at odds with the available data in P. falciparum as well as in P. vivax [11]. The alternative hypothesis [11] is that restrained recombination by selfing, inbreeding, or any other mechanism is a built-in evolutionary strategy used by the pathogen to avoid the "recombinational load" (break-up of favorable MLGs by recombination [34]), even when different MLGs are available for mating. Inbreeding, selfing, and unisexual reproduction can be considered as ways to add limited phenotypic and genotypic diversity in a clonal population without breaking favorable multilocus combinations [12,18]. Cryptococcus and Giardia possess meiosis genes [17,35]. However, these genes could be associated with functions other than meiosis: "Evolution is constantly re-using old genes for new purposes" [16]. We have proposed [3] that many micropathogens could possess a "clonality/sexuality machinery" rather than meiosis genes for switching between clonal evolution and recombination to face various evolutionary challenges. Selfing could be used by them instead of outcrossing, even when mating partners are available.

Figure 1. "Russian doll" model [10]. When population genetic tests are performed with appropriate markers (of sufficient resolution) within each of the near-clades, a and b, that subdivide the species, A, under study (large tree, left part of the figure), they reveal within these near-clades a miniature picture of the whole species, with the two main PCE features, namely LD and lesser near-clades (two small trees, a' and b', right part of the figure). This shows that PCE obtains also within the near-clades, and that these do not correspond to cryptic, potentially panmictic, biological species.
doi:10.1371/journal.ppat.1003908.g001

PLOS Pathogens | www.plospathogens.org | April 2014 | Volume 10 | Issue 4 | e1003908

PCE Manifestations in the Pathogens under Survey

We have proposed [1,2] that Giardia duodenalis and Cryptococcus neoformans undergo PCE. Unlike the case of Plasmodium [1,11], this proposal did not lead to heated controversy. That clonality is strong or preponderant is accepted in Cryptococcus [5,17,32] and G. duodenalis [4,36,37] and has been proposed for Cryptosporidium hominis [7]. As a matter of fact, the main PCE manifestations are easily observable in these pathogens. A few examples among the many available include the following:

• LD: It has been recorded in C. gattii [30,38-40], C. neoformans [26,30,38,40,41], Pneumocystis jirovecii [25], Cr. hominis [7], and G. duodenalis [13,37].

• Widespread, stable MLGs: In C. gattii, the MLG responsible for the "Vancouver epidemics," sequence type (ST) 39, has been isolated in Vancouver, the United States Pacific Coast, and Korea, in humans and in animals [39,42]. It is identical to the NIH 444 strain, isolated in 1970 [43]. In C. neoformans var. grubii, the MLG ST4 has been isolated from 1996 to 2007 in six different countries in Africa and Asia. ST5 has been isolated from 1983 to 2009 in four countries in North and South America, Europe, and Asia [26]. The MLG M5 is distributed in North and South America, Asia, Europe, and Africa [44]. In Pn. jirovecii, identical MLGs have been isolated in ten different European hospitals over 9 years, and in the same patients over 8 weeks [45].



• Near-clading: Near-clades are clearly identifiable in G. duodenalis [24,36,46,47]. As a matter of fact, the Giardia "assemblages" are perfectly equivalent to near-clades. They are stable, widespread, and occur in sympatry, including in the same host [36]. As we have stated [3,10,11], the near-clades are not defined by strict phylogenetic congruence among loci, but rather by a clear increase in phylogenetic signal when more loci are added. This is the case for Giardia assemblages, even if some discrepancies are observed among loci [48]. We have already called attention [3] to the fact that the many terms used by various authors to designate pathogen subspecific genetic subdivisions do not correspond to truly different evolutionary entities and are rather a manifestation of the compartmentalization of this scientific milieu. We propose that the "assemblages," "clusters," "clonal groups," and many other terms (see Table 1) correspond to a unique evolutionary entity, the near-clade. Using this single term instead of the many others now used in this field (see Table 1) has two main advantages: (i) the term near-clading has a clear evolutionary definition and (ii) the same evolutionary entity should not be designated by a wealth of different, imprecise terms. Obviously, this field of research calls for urgent semantic simplification. Near-clades are identified in Cr. hominis [7]. In the "C. neoformans complex of species" (CNC), the "molecular types" in C. neoformans VN I-IV and C. gattii VG I-IV [30-32] correspond to clearly delimited near-clades. The former species Pn. carinii proved to be subdivided into clearly differentiated genotypes with strong host specificity [49,50]. These host-specific genotypes have been given species status, although (i) host specificity is far from absolute and (ii) indications of hybridization are recorded among them [50]. Since some indications for clonality are recorded within these genotypes [25,45], they might as well be considered mere near-clades.

• Russian doll patterns: In C. gattii, within the cluster (near-clade) VGI, clonality obtains, and four lesser subdivisions, namely C1-4, are observed [32]. In VG II, three "clonal groups," a, b, and c, have been identified [31,42]. In G. duodenalis, assemblage A shows clear subdivisions ("subassemblages"); assemblage B and other assemblages may also exhibit subdivisions, although they are less well established [13,24,47,51,52].

• Data congruence: In the CNC, the near-clades are corroborated by Amplified Fragment Length Polymorphism (AFLP), MLST, PCR fingerprinting, Random Amplified Polymorphic DNA (RAPD), and Restriction Fragment Length Polymorphism (RFLP) [5,8,38]. The Giardia assemblages and their subdivisions (Russian dolls) are corroborated by Multilocus Enzyme Electrophoresis (MLEE) and sequence data [36,47].

Table 1. The many different terms used in the pathogen population genetic literature to designate the same evolutionary entity (near-clade).

Viruses               Bacteria            Parasitic protozoa             Fungi
clades                clades              assemblages                    AFLP groups
clusters              clonal complexes    clades                         clades
genogroups            clonal lineages     clonal lineages                clonal lineages
genotypes             clusters            clones                         clusters
major genotypes       genetic groups      clusters                       clonal groups
major lineages        genoclouds          core subgroups                 genetically distinct subgroups
phylogenetic groups   groups              discrete typing units (DTUs)   genotypes
-                     lineages            genetic groups                 genotypic groups
-                     -                   genotypes                      groups
-                     -                   groups                         lineages
-                     -                   haplotypes                     molecular genotypes
-                     -                   lesser subgroups               molecular types
-                     -                   populations                    phylogenetic species
-                     -                   subassemblages                 subclusters
-                     -                   subgroups                      subgenotypes
-                     -                   subpopulations                 subgroups
-                     -                   subtypes                       subpopulations
-                     -                   subtype groups                 varieties
-                     -                   types                          -

doi:10.1371/journal.ppat.1003908.t001

Starving sex versus built-in restrained recombination

Clonality in Cryptosporidium, whose cycle includes meiosis, is generally considered explainable by lack of outcrossing opportunity due to low transmission, or starving sex [53]. However, some data do not rule out the alternative hypothesis of built-in restrained recombination, even if the data are less conclusive than for Plasmodium [11]. In Ireland, Cr. parvum is considered panmictic due to high transmission rates. However, the percentage of multiclonal infections is lower in Ireland than in other European countries such as Italy, where Cr. parvum is not panmictic [54]. In the US Midwest, Cr. parvum is overall panmictic. However, it is "epidemic" (unstable clonality [27]) in Minnesota, where the transmission is high [55]. The widespread C. gattii genotype responsible for the Vancouver epidemics is supposed to be the result of "same sex mating" between identical MLGs [38]. This results in "meiotically-derived clones undetectable by molecular approaches" [43]. However, it cannot be inferred from the data whether same-sex mating is the result of starving sex or of built-in restrained recombination.

In summary, evidence that the main PCE signs obtain is strong in G. duodenalis and the CNC. Both present striking similarities with many other pathogens, for example, Trypanosoma cruzi [10] and Toxoplasma gondii [11,56], with significant LD; clearly delimited near-clades; ubiquitous, stable MLGs; and "Russian doll" patterns within the near-clades. Both Giardia and the CNC also present indications for limited recombination or hybridization, both within and between near-clades [36,47,57], and even between species in the case of the CNC [41]. As is the case for T. cruzi [58] and Toxoplasma [56], patterns of hybridization might be complex [41]. The case of Cryptosporidium is less clear. This apicomplexan genus is known to undergo a sexual phase during transmission cycles, as do Plasmodium and Toxoplasma. Indications for clonal evolution are present in some populations. One Cr. hominis MLG is dominant and widespread in the UK [59]. Some Cr. andersoni MLGs are widespread in North America and the Czech Republic [60] and in several Chinese regions [61]. LD evidence is strong in Cr. hominis [7,59,62] and Cr. parvum [9,59]. However, the impact of the Wahlund effect was not taken into account in [7,62]. Near-clading can be suspected in Cr. hominis [7], Cr. parvum [13], and Cr. muris [61], although the evidence is less clear than for Giardia and the CNC. Lastly, panmixia was inferred in some populations of Cr. parvum [54,55]. It is possible that Cryptosporidium population structure is similar to that of P. falciparum and P. vivax [11], with a continuum between panmixia and clonality and the existence of unstable near-clades. As for Plasmodium, whether clonality is due to starving sex or in-built genetic properties should be explored in depth. Obviously, the issue of Cryptosporidium population structure deserves further investigation.

Lastly, some indications for clonality were found in Pn. jirovecii [45]. However, evidence is far too limited to reach any firm conclusions.



Implications for Molecular 
Epidemiology and Experimental 
Evolution 

LD permits indirect typing; that is to say, the characterization of whole genotypes with only one gene, or a few genes. When LD is doubtful, indirect typing can be grossly misleading. This could be the case for Cryptosporidium subtyping with the single gp60 gene [63]. If recombination is frequent, multilocus typing [64] is not a solution since frequent recombination makes the MLGs ephemeral. Still, the fact remains that the population structure of Cryptosporidium is far from being panmictic. Even if it is not strong enough to lead to stable near-clades, restrained recombination in these parasites constitutes a major stratification factor that should be taken into account in molecular epidemiology and all applied studies, as it should in Plasmodium [11].
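
The dependence of indirect typing on LD can be illustrated with a minimal sketch. The function name, the accuracy measure, and the toy genotypes are assumptions made for illustration and do not come from the cited studies. For a chosen marker locus, each allele is used to predict the most frequent multilocus genotype that carries it, and the function reports the fraction of isolates whose full genotype is predicted correctly.

```python
from collections import Counter, defaultdict


def indirect_typing_accuracy(mlgs, locus):
    """Fraction of isolates whose full MLG is recovered from one locus.

    mlgs  : list of tuples, one allele label per locus for each isolate
    locus : index of the single marker used for indirect typing
    Each marker allele predicts the most common MLG observed with it.
    """
    by_allele = defaultdict(Counter)
    for mlg in mlgs:
        by_allele[mlg[locus]][mlg] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_allele.values())
    return correct / len(mlgs)


# Toy data (assumed). Strong LD: the marker allele identifies the genotype.
clonal = [("a", "x", "p")] * 3 + [("b", "y", "q")] * 3
# Random association: every allele combination is equally frequent.
recombining = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]

print(indirect_typing_accuracy(clonal, locus=0))       # 1.0
print(indirect_typing_accuracy(recombining, locus=0))  # 0.5
```

When alleles at the other loci are randomly associated with the marker, a single gene recovers the full genotype no better than the frequency of the commonest combination allows, which is the situation the text warns about.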

When the evidence for PCE is clear, clonal MLGs and near-clades are convenient units of analysis for both molecular epidemiology and experimental evolution [3], thanks to their stability in space and time. Near-clades can be characterized by specific markers [13].
Taxonomical Implications

We have called attention to the fact that radically dissimilar taxonomical inferences could be drawn from similar sets of data [65]. Scientists working on the pathogens here surveyed have granted considerable attention to taxonomical problems and species definition and delimitation. The conclusions they have reached vary considerably. The PCE model allows reconsidering these questions.

Two main species concepts are involved in these debates: the biological species concept (BSC) [66] and the phylogenetic species concept (PSC) [67]. The BSC demands two criteria: (i) within the species, gene flow should have no other limitations than physical obstacles (potential panmixia) and (ii) it should be inhibited between species by built-in biological mechanisms. The PSC stipulates that species should correspond to clades, between which, by definition, gene flow is interrupted. Generally, authors propose a mix of genetic and biological characteristics to define species [68]. Some attempts have been made to apply the BSC to the CNC: experiments have shown that crosses within C. gattii VG II are easy, while they are difficult between II and III [31]. The authors have proposed that II and III deserve the status of biological species. This is debatable for two reasons: (i) experiments tell nothing about the frequency of recombination in nature [3]